Improving Prediction of Speech Activity Using Multi-Participant Respiratory State
Abstract
One consequence of situated face-to-face conversation is the coobservability of participants’ respiratory movements and sounds. We explore whether this information can be exploited in predicting incipient speech activity. Using a methodology called stochastic turn-taking modeling, we compare the performance of a model trained on speech activity alone to one additionally trained on static and dynamic lung volume features. The methodology permits automatic discovery of temporal dependencies across participants and feature types. Our experiments show that respiratory information substantially lowers cross-entropy rates, and that this generalizes to unseen data.
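The comparison described above rests on a cross-entropy rate, which can be read as the average number of bits needed per frame to encode the observed speech activity under a model's predictions. The following is a minimal sketch of that measure for a single participant's binary speech/non-speech labels; it is not the authors' implementation. The function name `cross_entropy_rate`, the framing of the input as per-frame probabilities, and all numeric values are illustrative assumptions.

```python
import math


def cross_entropy_rate(probs, labels):
    """Average negative log2-likelihood (bits per frame) of binary labels.

    probs[i]  -- predicted probability that frame i contains speech
    labels[i] -- 1 if frame i contains speech, 0 otherwise
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and of equal length")
    total_bits = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) on over-confident predictions.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total_bits -= math.log2(p if y == 1 else 1.0 - p)
    return total_bits / len(labels)


# Toy, made-up values: six frames of speech activity for one participant.
labels = [0, 0, 1, 1, 1, 0]

# An uninformed model that always predicts 0.5.
baseline_probs = [0.5] * len(labels)

# A hypothetical model whose predictions lean toward the correct label.
informed_probs = [0.2, 0.3, 0.7, 0.8, 0.9, 0.1]

baseline_rate = cross_entropy_rate(baseline_probs, labels)
informed_rate = cross_entropy_rate(informed_probs, labels)

print(f"baseline: {baseline_rate:.3f} bits/frame")
print(f"informed: {informed_rate:.3f} bits/frame")
```

A lower rate means the model assigns higher probability to what actually happened, which is the sense in which additional features can be said to lower cross-entropy.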
Related resources
Floor holder detection and end of speaker turn prediction in meetings
We propose a novel fully automatic framework to detect which meeting participant is currently holding the conversational floor and when the current speaker turn is going to finish. Two sets of experiments were conducted on a large collection of multiparty conversations: the AMI meeting corpus. Unsupervised speaker turn detection was performed by post-processing the speaker diarization and the s...
Speech quality improvement of a multi-pulse speech codec with pitch prediction on a single chip signal processor
Multi-pulse speech coding with pitch prediction is known as an efficient speech coding method. In this paper, a new pulse search method is proposed for improving speech quality with a small amount of computation. Characteristics of this pulse search method are listed below. 1. Modifying pulse amplitude in pulse search loop. 2. Controlling pulse search conditions. 3. Quantization of pulse ...
Comparing Parallel Simulated Annealing, Parallel Vibrating Damp Optimization and Genetic Algorithm for Joint Redundancy-Availability Problems in a Series-Parallel System with Multi-State Components
In this paper, we study different methods of solving joint redundancy-availability optimization for series-parallel systems with multi-state components. We analyze various factors that affect system availability in order to determine the optimum number and version of components in each sub-system, and consider the effects of improving failure rates of each component in each sub-system and impr...
Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models
In this paper, the gain in the LD-CELP speech coding algorithm is predicted using three neural models that are equipped with genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of the neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...